<!DOCTYPE html>
<html lang="en">
<head>
	<meta charset="UTF-8">
	<meta http-equiv="X-UA-Compatible" content="IE=edge">
    <meta name="viewport" content="width=device-width, initial-scale=1">
	<title>Path Hierarchy Tokenizer | ElasticSearch 7.7 权威指南中文版</title>
	<meta name="keywords" content="ElasticSearch 权威指南中文版, elasticsearch 7, es7, 实时数据分析，实时数据检索" />
    <meta name="description" content="ElasticSearch 权威指南中文版, elasticsearch 7, es7, 实时数据分析，实时数据检索" />
    <!-- Give IE8 a fighting chance -->
    <!--[if lt IE 9]>
    <script src="https://oss.maxcdn.com/html5shiv/3.7.2/html5shiv.min.js"></script>
    <script src="https://oss.maxcdn.com/respond/1.4.2/respond.min.js"></script>
    <![endif]-->
	<link rel="stylesheet" type="text/css" href="../static/styles.css" />
	<script>
	var _link = 'analysis-pathhierarchy-tokenizer.html';
    </script>
</head>
<body>
<div class="main-container">
    <section id="content">
        <div class="content-wrapper">
            <section id="guide" lang="zh_cn">
                <div class="container">
                    <div class="row">
                        <div class="col-xs-12 col-sm-8 col-md-8 guide-section">
                            <div style="color:gray; word-break: break-all; font-size:12px;">原英文版地址: <a href="https://www.elastic.co/guide/en/elasticsearch/reference/7.7/analysis-pathhierarchy-tokenizer.html" rel="nofollow" target="_blank">https://www.elastic.co/guide/en/elasticsearch/reference/7.7/analysis-pathhierarchy-tokenizer.html</a>, 原文档版权归 www.elastic.co 所有<br/>本地英文版地址: <a href="../en/analysis-pathhierarchy-tokenizer.html" rel="nofollow" target="_blank">../en/analysis-pathhierarchy-tokenizer.html</a></div>
                        <!-- start body -->
                  <div class="page_header">
<strong>重要</strong>: 此版本不会发布额外的bug修复或文档更新。最新信息请参考 <a href="https://www.elastic.co/guide/en/elasticsearch/reference/current/index.html" rel="nofollow">当前版本文档</a>。
</div>
<div id="content">
<div class="breadcrumbs">
<span class="breadcrumb-link"><a href="index.html">Elasticsearch Guide [7.7]</a></span>
»
<span class="breadcrumb-link"><a href="analysis.html">Text analysis</a></span>
»
<span class="breadcrumb-link"><a href="analysis-tokenizers.html">Tokenizer reference</a></span>
»
<span class="breadcrumb-node">Path Hierarchy Tokenizer</span>
</div>
<div class="navheader">
<span class="prev">
<a href="analysis-ngram-tokenizer.html">« N-gram tokenizer</a>
</span>
<span class="next">
<a href="analysis-pathhierarchy-tokenizer-examples.html">Path Hierarchy Tokenizer Examples »</a>
</span>
</div>
<div class="section">
<div class="titlepage"><div><div>
<h2 class="title">
<a id="analysis-pathhierarchy-tokenizer"></a>Path Hierarchy Tokenizer<a class="edit_me edit_me_private" rel="nofollow" title="Editing on GitHub is available to Elastic" href="https://github.com/elastic/elasticsearch/edit/7.7/docs/reference/analysis/tokenizers/pathhierarchy-tokenizer.asciidoc">edit</a>
</h2>
</div></div></div>
<p>The <code class="literal">path_hierarchy</code> tokenizer takes a hierarchical value like a filesystem
path, splits on the path separator, and emits a term for each component in the
tree.</p>
<h3>
<a id="_example_output_15"></a>Example output<a class="edit_me edit_me_private" rel="nofollow" title="Editing on GitHub is available to Elastic" href="https://github.com/elastic/elasticsearch/edit/7.7/docs/reference/analysis/tokenizers/pathhierarchy-tokenizer.asciidoc">edit</a>
</h3>
<div class="pre_wrapper lang-console">
<pre class="programlisting prettyprint lang-console">POST _analyze
{
  "tokenizer": "path_hierarchy",
  "text": "/one/two/three"
}</pre>
</div>
<div class="console_widget" data-snippet="snippets/873.console"></div>
<p>The above text would produce the following terms:</p>
<div class="pre_wrapper lang-text">
<pre class="programlisting prettyprint lang-text">[ /one, /one/two, /one/two/three ]</pre>
</div>
<h3>
<a id="_configuration_16"></a>Configuration<a class="edit_me edit_me_private" rel="nofollow" title="Editing on GitHub is available to Elastic" href="https://github.com/elastic/elasticsearch/edit/7.7/docs/reference/analysis/tokenizers/pathhierarchy-tokenizer.asciidoc">edit</a>
</h3>
<p>The <code class="literal">path_hierarchy</code> tokenizer accepts the following parameters:</p>
<div class="informaltable">
<table border="0" cellpadding="4px">
<colgroup>
<col>
<col>
</colgroup>
<tbody valign="top">
<tr>
<td valign="top">
<p>
<code class="literal">delimiter</code>
</p>
</td>
<td valign="top">
<p>
The character to use as the path separator.  Defaults to <code class="literal">/</code>.
</p>
</td>
</tr>
<tr>
<td valign="top">
<p>
<code class="literal">replacement</code>
</p>
</td>
<td valign="top">
<p>
An optional replacement character to use for the delimiter.
Defaults to the <code class="literal">delimiter</code>.
</p>
</td>
</tr>
<tr>
<td valign="top">
<p>
<code class="literal">buffer_size</code>
</p>
</td>
<td valign="top">
<p>
The number of characters read into the term buffer in a single pass.
Defaults to <code class="literal">1024</code>.  The term buffer will grow by this size until all the
text has been consumed.  It is advisable not to change this setting.
</p>
</td>
</tr>
<tr>
<td valign="top">
<p>
<code class="literal">reverse</code>
</p>
</td>
<td valign="top">
<p>
If set to <code class="literal">true</code>, emits the tokens in reverse order.  Defaults to <code class="literal">false</code>.
</p>
</td>
</tr>
<tr>
<td valign="top">
<p>
<code class="literal">skip</code>
</p>
</td>
<td valign="top">
<p>
The number of initial tokens to skip.  Defaults to <code class="literal">0</code>.
</p>
</td>
</tr>
</tbody>
</table>
</div>
<h3>
<a id="_example_configuration_9"></a>Example configuration<a class="edit_me edit_me_private" rel="nofollow" title="Editing on GitHub is available to Elastic" href="https://github.com/elastic/elasticsearch/edit/7.7/docs/reference/analysis/tokenizers/pathhierarchy-tokenizer.asciidoc">edit</a>
</h3>
<p>In this example, we configure the <code class="literal">path_hierarchy</code> tokenizer to split on <code class="literal">-</code>
characters, and to replace them with <code class="literal">/</code>.  The first two tokens are skipped:</p>
<div class="pre_wrapper lang-console">
<pre class="programlisting prettyprint lang-console">PUT my_index
{
  "settings": {
    "analysis": {
      "analyzer": {
        "my_analyzer": {
          "tokenizer": "my_tokenizer"
        }
      },
      "tokenizer": {
        "my_tokenizer": {
          "type": "path_hierarchy",
          "delimiter": "-",
          "replacement": "/",
          "skip": 2
        }
      }
    }
  }
}

POST my_index/_analyze
{
  "analyzer": "my_analyzer",
  "text": "one-two-three-four-five"
}</pre>
</div>
<div class="console_widget" data-snippet="snippets/874.console"></div>
<p>The above example produces the following terms:</p>
<div class="pre_wrapper lang-text">
<pre class="programlisting prettyprint lang-text">[ /three, /three/four, /three/four/five ]</pre>
</div>
<p>If we were to set <code class="literal">reverse</code> to <code class="literal">true</code>, it would produce the following:</p>
<div class="pre_wrapper lang-text">
<pre class="programlisting prettyprint lang-text">[ one/two/three/, two/three/, three/ ]</pre>
</div>
<h3>
<a id="_detailed_examples"></a>Detailed Examples<a class="edit_me edit_me_private" rel="nofollow" title="Editing on GitHub is available to Elastic" href="https://github.com/elastic/elasticsearch/edit/7.7/docs/reference/analysis/tokenizers/pathhierarchy-tokenizer.asciidoc">edit</a>
</h3>
<p>See <a class="xref" href="analysis-pathhierarchy-tokenizer-examples.html" title="Path Hierarchy Tokenizer Examples">detailed examples here</a>.</p>
</div>
<div class="navfooter">
<span class="prev">
<a href="analysis-ngram-tokenizer.html">« N-gram tokenizer</a>
</span>
<span class="next">
<a href="analysis-pathhierarchy-tokenizer-examples.html">Path Hierarchy Tokenizer Examples »</a>
</span>
</div>
</div>

                  <!-- end body -->
                        </div>
                        <div class="col-xs-12 col-sm-4 col-md-4" id="right_col">
                        
                        </div>
                    </div>
                </div>
            </section>
        </div>
    </section>
</div>
<script src="../static/cn.js"></script>
</body>
</html>